Added win64 ports of ThreadX and ThreadX SMP #529
Open
fdesbiens wants to merge 18 commits into
Added the win64 port sources, CMake integration, and Windows build and test scripts. Updated shared initialization and regression infrastructure for MSVC-hosted Windows simulation, and adjusted Windows-host timing tolerances in regression tests to keep the suite stable.
Replaced coarse Win64 scheduler polling with an event-driven wake path and switched the simulated timer to one-shot rearming to avoid catch-up ticks. Reduced Win64 regression slow-timer settings to 10 ms for the stable configurations, while keeping disable_notify_callbacks_build at 15 ms for reliability. Hardened the Windows build wrapper by invoking Ninja directly for Ninja build trees, fixing timeout detection, enabling a default build timeout, and limiting fallback command replay to real timeout cases.
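The effect of one-shot rearming on catch-up ticks can be sketched with a small portable model. This is not the port's timer code: the function names are hypothetical, and plain millisecond counters stand in for the Win32 waitable timer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not the port's code: deadlines are millisecond
   counters rather than Win32 waitable-timer handles. */

/* Fixed-schedule timer: deadlines sit on a fixed grid, so a late wake-up
   delivers one tick for every deadline that has already passed. */
uint32_t ticks_fixed_schedule(uint64_t *next_deadline_ms,
                              uint64_t now_ms, uint64_t period_ms)
{
    uint32_t ticks = 0;

    assert(period_ms > 0);  /* a zero period would never advance */
    while (*next_deadline_ms <= now_ms) {
        ticks++;
        *next_deadline_ms += period_ms;
    }
    return ticks;
}

/* One-shot rearm: deliver at most one tick, then schedule the next
   deadline relative to the current time instead of the old grid. */
uint32_t ticks_one_shot(uint64_t *next_deadline_ms,
                        uint64_t now_ms, uint64_t period_ms)
{
    assert(period_ms > 0);
    if (*next_deadline_ms > now_ms) {
        return 0;  /* not due yet */
    }
    *next_deadline_ms = now_ms + period_ms;
    return 1;
}
```

With a 10 ms period and a wake-up that arrives 35 ms late, the fixed-schedule variant reports a burst of ticks, while the one-shot variant reports a single tick and moves on.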
Restored Linux builds in scripts/build_tx.sh by making the regression tx_initialize_low_level generator tolerant of port-specific formatting. Replaced the brittle exact-string insertion logic with line-based matching so the test interrupt dispatcher hook was inserted reliably for both Linux and Windows simulator ports.
…cheduler timeout

Three targeted fixes that together produce a 20% overall speedup across the SMP regression suite (150.3s -> 124.8s) with no regressions (all 109 tests pass).

## Fix 1 - Skip SuspendThread when _tx_thread_preempt_disable != 0 (tx_thread_context_save.c / tx_thread_context_restore.c)

When the timer ISR fires while a ThreadX thread is inside a TX_DISABLE section (_tx_thread_preempt_disable != 0), the old code called SuspendThread() / ResumeThread() unconditionally, wasting ~100 us per tick for zero benefit: context_restore would always skip preemption in that state because the ISR cannot lower _tx_thread_preempt_disable below its value at ISR entry while it holds the Win32 critical section.

New behaviour:
- context_save sets suspension_type = 3 (new port-local state) instead of calling SuspendThread, letting the thread continue automatically once the critical section is released.
- context_restore clears suspension_type 3 without calling ResumeThread.

This is the primary driver of the improvement (e.g. threadx_thread_delayed_suspension_test: 18.07s -> 2.29s, 7.9x speedup).

Also fixes a latent bug in context_restore where ResumeThread was called even when context_save had skipped SuspendThread because mutex_access was TRUE (thread spinning on the Win32 CS).

## Fix 2 - 2 ms scheduler event timeout (tx_initialize_low_level.c, _tx_win32_wait_for_scheduler_event)

Matches the Linux SMP port's sem_timedwait(2 ms) pattern.
- Prevents indefinite stall on any missed SetEvent().
- Introduces slight timing jitter that helps break the systematic phase resonance observed in threadx_thread_wait_abort_and_isr_test, where the timer tick was always landing outside the _tx_thread_preempt_disable window, requiring far more ticks to accumulate 20 condition hits.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
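The pairing rule from Fix 1 can be sketched as a small portable model. It is not the port's code: the struct, the function names, and the default value 0 for suspension_type are assumptions, and counters stand in for the Win32 SuspendThread() / ResumeThread() calls. Only the value 3 is taken from the commit message. The preemption decision that the real context_restore makes is left out.

```c
#include <assert.h>

/* Hypothetical model of the Fix 1 pairing rule. */
enum {
    MODEL_SUSPENSION_DEFAULT = 0,  /* assumed "nothing special recorded" value */
    MODEL_SUSPENSION_SKIPPED = 3   /* port-local state named in the commit */
};

typedef struct {
    int      suspension_type;
    unsigned os_suspend_calls;  /* counts would-be SuspendThread() calls */
    unsigned os_resume_calls;   /* counts would-be ResumeThread() calls */
} model_thread;

/* ISR entry: only suspend the host thread when preemption is possible. */
void model_context_save(model_thread *t, unsigned preempt_disable)
{
    if (preempt_disable != 0) {
        /* Thread is inside a TX_DISABLE section: let it keep running. */
        t->suspension_type = MODEL_SUSPENSION_SKIPPED;
    } else {
        t->os_suspend_calls++;
        t->suspension_type = MODEL_SUSPENSION_DEFAULT;
    }
}

/* ISR exit: resume only if the matching save actually suspended. */
void model_context_restore(model_thread *t)
{
    if (t->suspension_type == MODEL_SUSPENSION_SKIPPED) {
        t->suspension_type = MODEL_SUSPENSION_DEFAULT;  /* clear, no resume */
    } else {
        t->os_resume_calls++;
    }
    /* Every resume must be matched by an earlier suspend. */
    assert(t->os_resume_calls <= t->os_suspend_calls);
}
```

In this model a tick that lands while preempt_disable is nonzero costs no suspend/resume pair at all, which is the saving the commit attributes to Fix 1.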
When context_save fires and the current thread has mutex_access == TRUE (it is spinning on the Win32 critical section waiting to acquire it), calling SuspendThread is wasteful: the thread cannot execute any protected ThreadX code while blocked on the spinlock and will proceed naturally once the ISR releases the CS. Flag such threads with suspension_type 4 so context_restore skips the matching ResumeThread.

Two subtle bugs were fixed along the way:

1. Thread-ID scan instead of stale-TLS lookup

   _tx_win32_critical_section_obtain previously used the TLS variable _tx_win32_current_virtual_core to find the calling thread's struct. That TLS is only refreshed via the run-semaphore wake path (type 2); after a type-1 (SuspendThread/ResumeThread) hand-off the TLS can point to the wrong virtual core. A thread on core N with stale TLS=M would stamp mutex_access = TRUE on _tx_thread_current_ptr[M], which is a completely different thread. The fix scans _tx_win32_virtual_cores[] by OS thread ID to find the correct struct.

2. context_restore no-preemption path: drop mutex_access guard

   The original defensive else in context_restore skipped ResumeThread when mutex_access was TRUE. After fix #1 this situation still has a narrow race window (set before CAS, clear after CAS; timer ISR can land between them). If the ISR had already called SuspendThread (suspension_type == 0) before mutex_access was stamped, the defensive else would leave the thread permanently OS-suspended: deadlock. The fix relies solely on suspension_type: type 3/4 => no SuspendThread was issued => no ResumeThread; any other type => ResumeThread always.

Results (109/109 tests pass):

    Round-1 baseline : 124.8 s
    This commit      : 105.7 s (-15.4 %, total -32 % vs original 150 s)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
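The thread-ID scan can be illustrated with a simplified model. The struct layout, the function names, and the choice to keep mutex_access on the core entry (the commit stamps it on the thread found through the core) are assumptions made for this sketch; a plain integer stands in for the OS thread ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of _tx_win32_virtual_cores[]. */
typedef struct {
    uint32_t os_thread_id;  /* host thread currently bound to this core */
    int      mutex_access;  /* nonzero while that thread spins on the CS */
} model_core;

/* Find the core whose host thread ID matches the caller.
   Returns the core index, or -1 if the caller is not a core thread. */
int model_find_core_by_thread_id(const model_core cores[], size_t count,
                                 uint32_t caller_id)
{
    assert(cores != NULL);
    for (size_t i = 0; i < count; i++) {
        if (cores[i].os_thread_id == caller_id) {
            return (int)i;
        }
    }
    return -1;
}

/* Stamp mutex_access on the caller's own entry, ignoring any cached index. */
int model_mark_mutex_access(model_core cores[], size_t count,
                            uint32_t caller_id)
{
    int index = model_find_core_by_thread_id(cores, count, caller_id);
    if (index >= 0) {
        cores[index].mutex_access = 1;
    }
    return index;
}
```

Because the lookup keys on the caller's own ID, a cached index that still points at another core can no longer cause the flag to land on a different thread.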
…scan

Three targeted optimisations over the Round 2 baseline (105.7 s):

1. Restore start_ack rendezvous in scheduler (revert Round 3 Phase 2)

   The fast-rendezvous approach (removing the start_ack wait) degraded wait_abort_and_isr from 11.98 s to 30 s because the CS-hold during start_ack provides the timing window where other threads spin with preempt_disable != 0, which the ISR must observe to satisfy the test condition. Reverted tx_thread_schedule.c to the Round 2 state and updated the explanatory comment in tx_thread_context_restore.c.

2. Increase TX_WIN32_CONTENTION_PAUSE_COUNT 64 → 256

   Each time a thread spins on the Win32 critical section and fails a CAS, it increments a counter; on reaching the threshold it calls _tx_win32_thread_yield() (SwitchToThread/Sleep(0)) and resets the counter. With a threshold of 64, heavily contended tests triggered SwitchToThread() on every 64th failed CAS, which is extremely expensive (~50 µs/call). Raising to 256 reduces the call rate 4x while keeping the same eventual-yield guarantee. The smp_random_resume_suspend* tests benefit most: 8-10 s → ~2 s each.

3. TLS-hinted current_thread lookup in _tx_win32_critical_section_obtain

   The hot path that stamps mutex_access=TRUE previously scanned all TX_THREAD_SMP_MAX_CORES (4) virtual-core entries on every CS acquisition by a ThreadX thread. The TLS _tx_win32_current_virtual_core index is checked first; if it matches (common case) the scan is skipped entirely. A full 4-way fallback scan is still performed when the TLS value is stale (e.g. after a type-1 scheduler hand-off), preserving the Round 2 correctness fix.

Results (109/109 pass, all timings on same machine):

    Baseline (original):            150.3 s
    Round 1 (skip redundant susp):  124.8 s (-20 %)
    Round 2 (mutex_access type-4):  105.7 s (-32 %)
    Round 4 (this commit):          65.45 s (-57 %) ← new best
    Linux reference:                59.8 s

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
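The contention pause threshold can be sketched as a bounded spin loop in portable C11. Everything here is a hypothetical model: the lock type, the attempt limit (added so the sketch terminates), and the callback that stands in for _tx_win32_thread_yield() are not taken from the port.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin lock: 0 = free, 1 = held. */
typedef struct {
    atomic_int locked;
} model_lock;

/* Stand-in for _tx_win32_thread_yield(): only counts how often it runs. */
void model_count_yield(void *ctx)
{
    (*(unsigned *)ctx)++;
}

/* Try to take the lock up to max_attempts times. After every pause_count
   failed CAS attempts, call yield_fn once and restart the failure count.
   Returns true if the lock was acquired. */
bool model_lock_acquire(model_lock *lock, unsigned pause_count,
                        unsigned max_attempts,
                        void (*yield_fn)(void *), void *ctx)
{
    unsigned failures = 0;

    assert(pause_count > 0);
    for (unsigned attempt = 0; attempt < max_attempts; attempt++) {
        int expected = 0;
        if (atomic_compare_exchange_strong(&lock->locked, &expected, 1)) {
            return true;
        }
        if (++failures >= pause_count) {
            yield_fn(ctx);  /* expensive path, taken once per threshold */
            failures = 0;
        }
    }
    return false;
}
```

For the same number of failed CAS attempts, a threshold of 256 reaches the yield path a quarter as often as a threshold of 64, while a spinning thread still yields eventually.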
Tests-Win-After5.txt captures a fresh rebuild of commit f47e102d to correct a stale-binary measurement. The earlier Tests-Win-After4.txt (65.45 s) remains valid but reflects a lucky run where the probabilistic delayed_suspension test resolved in ~50 ms instead of the typical ~11 s. The 78.24 s figure is the representative round-4 result:

    Baseline (original):  150.3 s
    Round 1:              124.8 s (-17 %)
    Round 2:              105.7 s (-30 %)
    Round 4 (typical):     78.2 s (-48 %) <- this commit
    Round 4 (best case):   65.5 s (-56 %)
    Linux reference:       59.8 s

Remaining gap to Linux is dominated by timer-bound and probabilistic tests (byte_memory_thread_contention ~13.5 s, wait_abort_and_isr ~14-15 s, delayed_suspension ~0.5-11 s, timer_multiple ~6.3 s).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
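The percentages in this commit message read as reductions relative to the 150.3 s baseline. A small helper (hypothetical, not part of the PR) makes that arithmetic explicit, assuming the usual (baseline - measured) / baseline definition:

```c
#include <assert.h>

/* Percentage reduction of measured_s relative to baseline_s.
   Assumes the usual definition: (baseline - measured) / baseline * 100. */
double reduction_percent(double baseline_s, double measured_s)
{
    assert(baseline_s > 0.0);
    return (baseline_s - measured_s) / baseline_s * 100.0;
}
```

Calling `reduction_percent(150.3, 78.2)` gives a value just under 48, matching the typical Round 4 entry when rounded to a whole percent.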
Attempted Win32 thread priority mapping (TX priority 0 -> BELOW_NORMAL, all others -> LOWEST) to guide OS scheduling toward higher-priority ThreadX threads. Result: 83.36s - worse than Round 4's 78.24s baseline. Root cause: the ISR resonance tests (wait_abort_and_isr, delayed_suspension) depend on precise timing equilibria. Elevating any user thread to BELOW_NORMAL disrupts these equilibria unpredictably. The timer-bound tests (byte_memory_thread_contention, timer_multiple_test) that dominate the total time cannot benefit from priority mapping. Port code reverted to Round 4 state (commit f47e102d). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replaced per-thread semaphore handshakes with Windows address waits. Added high-resolution waitable timer support and regression ISR sampling. Preserved the 100 Hz ThreadX tick cadence. Tightened SMP test and clean-build watchdog defaults. Co-authored-by: Codex (gpt 5.5) <codex@openai.com>
Updated Win32, Win64, and Win64 SMP port version metadata to 6.5.1.202602. Aligned standalone block-comment terminators in the regular Windows port headers. Co-authored-by: Codex (gpt 5.5) <codex@openai.com>
Waited for each created Windows host thread to reach the controlled run-semaphore handoff before allowing ThreadX scheduling. Guarded disable-notify builds against stale host threads entering the ThreadX shell after deletion. Co-authored-by: Codex (gpt 5.5) <codex@openai.com>
Enabled high-resolution waitable timers for the Win64 simulator and used SetWaitableTimerEx when available. Bounded Windows host-thread cleanup during thread delete/reset so stale host threads could not spin indefinitely during regression cleanup. Co-authored-by: Codex (gpt 5.5) <codex@openai.com>
Added -Clean support to the regular and SMP Windows test scripts so stale CTest Testing directories were removed before a run. Skipped Visual Studio DevShell re-entry when the active MSVC environment already matched the requested architecture. Defaulted SMP failure repeats to two attempts for timing-sensitive Windows simulator regressions. Co-authored-by: Codex (gpt 5.5) <codex@openai.com>
Removed Windows-specific timer-thread cleanup from testcontrol after the port handled stale host-thread teardown directly. Restored stricter event flag, sleep, and timer expectations where port fixes made the previous Windows accommodations unnecessary. Co-authored-by: Codex (gpt 5.5) <codex@openai.com>
@billlamiework and @cypherbridge This PR adds brand-new Win64 ports of ThreadX and ThreadX SMP. There are now PowerShell scripts to build the code and run the regression tests for them. At this point, all tests pass reliably for both versions. Would you mind reviewing the code, please? If you want to run them, you will need Visual Studio Build Tools 2022 (I used 19.42.34436.0) or any Visual Studio installation bundling them. The code targets Windows 11 only, as other versions are obsolete at this point. I will probably update the win32 port to use the same compiler later.